ABSTRACT



This booklet, geared toward an advanced high school or early 
college-level audience, explains how structural biology provides insight into 
health and disease and is useful in developing new medications. This 
publication contains a general introduction to proteins, coverage of the 
techniques used to determine protein structures, and a chapter on 
structure-based drug design. The booklet features "Student Snapshots," 
designed to inspire young people to consider careers in biomedical research. 
Review questions at the end of each chapter are also included. Chapter 1 
discusses the "structures of life" and their role in the structure and 
function of all living things. Chapters 2 and 3 describe X-ray crystallography and 
nuclear magnetic resonance spectroscopy, the tools that structural biologists use 
to study the detailed shapes of proteins and other biological molecules. 
Chapter 4 explains how the shape of proteins can be used to help 
design new medications, in this case drugs to treat AIDS and arthritis. 
Chapter 5 provides more examples of how structural biology teaches about all 
life processes, including those of humans. (ASK) 
PREFACE



Imagine that you are a scientist probing the secrets
of living systems not with a scalpel or microscope,
but much deeper, at the level of single molecules,
the building blocks of life. You’ll focus on the
detailed, three-dimensional structure of biological
molecules. You’ll create intricate models of these
molecules using sophisticated computer graphics.

You may be the first person to see the shape
of a molecule involved in health or disease.
You are part of the growing field of
structural biology.

The molecules whose shapes most tantalize
structural biologists are proteins, because these
molecules do most of the work in the body.
Like many everyday objects, proteins are shaped
to get their job done. The structure of a protein
offers clues about the role it plays in the body.
It may also hold the key to developing new
medicines, materials, or diagnostic procedures.

In addition to teaching about our bodies, these
“structures of life” may hold the key to developing
new medicines, materials, and diagnostic procedures.

In Chapter 1, you’ll learn more about these
“structures of life” and their role in the structure
and function of all living things. In Chapters
2 and 3, you’ll learn about the tools, X-ray
crystallography and nuclear magnetic resonance
spectroscopy, that structural biologists use
to study the detailed shapes of proteins and other
biological molecules.





Proteins, like many everyday objects,
are shaped to get their job done.
The long neck of a screwdriver allows
you to tighten screws in holes or pry
open lids. The depressions in an egg
carton are designed to cradle eggs
so they won't break. A funnel's wide
brim and narrow neck enable the
transfer of liquids into a container
with a small opening. The shape
of a protein, although much more
complicated than the shape of
a common object, teaches us
about that protein's role in the body.
Chapter 4 will explain how the shape of proteins 
can be used to help design new medications — in 
this case, drugs to treat AIDS and arthritis. And 
finally, Chapter 5 will provide more examples of 
how structural biology teaches us about all life 
processes, including those of humans. 

Much of the research described in this booklet 
is supported by U.S. tax dollars, specifically those 
awarded by the National Institute of General 
Medical Sciences (NIGMS) to 
scientists at universities across the 
nation. NIGMS supports more 
structural biology than any other 
private or government agency 
in the world. 

NIGMS is also unique among the 
components of the National Institutes of Health 
(NIH) in that its main goal is to support basic 
biomedical research that at first may not be linked 
to a specific disease or body part. These studies 
increase our understanding of life’s most fundamental
processes (what goes on at the molecular
and cellular level) and the diseases that result
when these processes malfunction.

Advances in such basic research often lead to 
many practical applications, including new scientific 
tools and techniques, and fresh approaches to 
diagnosing, treating, and preventing disease. 




Structural biology requires the 
cooperation of many different 
scientists, including biochemists, 
molecular biologists, X-ray 
crystallographers, and NMR 
spectroscopists. Although these
researchers use different techniques
and may focus on different molecules, 
they are united by their desire 
to better understand biology by 
studying the detailed structure 
of biological molecules. 



Alisa Zapp Machalek 
Science Writer, NIGMS 
November 2000 
CHAPTER 1

You’ve probably heard that proteins are
important nutrients that help you build 
muscles. But they are much more than that. 
Proteins are the worker molecules that make 
possible every activity in your body. They
circulate in your blood, seep from your tissues,
and grow in long strands out of your head. 
Proteins are also the key components of biological 
materials ranging from silk fibers to elk antlers. 







Proteins have many different functions in our bodies. By studying the structures of 
proteins, we are better able to understand how they function normally and how 
some proteins with abnormal shapes can cause disease. 
Proteins Are Made From Small Building Blocks

Proteins are like long necklaces with differently 
shaped beads. Each “bead” is a small molecule 
called an amino acid. There are 20 standard amino 
acids, each with its own shape, size, and properties. 

Proteins contain from 50 to 5,000 amino acids 
hooked end-to-end in many combinations. Each 
protein has its own sequence of amino acids. 
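
To get a feel for what "many combinations" means, here is a minimal sketch that counts how many different chains could in principle be strung together from the 20 standard amino acids. It is an illustration only: the function names are made up for this example, and real organisms use only a tiny fraction of these possible sequences.

```python
import math

AMINO_ACID_TYPES = 20  # the 20 standard amino acids


def possible_sequences(length):
    """Number of distinct chains of this length: 20 choices at every position."""
    return AMINO_ACID_TYPES ** length


def digit_count(length):
    """Number of decimal digits in possible_sequences(length), found with logarithms."""
    return math.floor(length * math.log10(AMINO_ACID_TYPES)) + 1


# The booklet's range: proteins contain from 50 to 5,000 amino acids.
for length in (50, 5000):
    print(f"{length} amino acids: a number with {digit_count(length)} digits")
```

Even for the shortest proteins, the count of possible sequences is a number 66 digits long, which is why each protein can have a sequence of its own.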

These amino acid chains do not remain straight 
and orderly. They twist and buckle, folding in upon 
themselves, the knobs of some amino acids nestling 
into grooves in others. 



Only when the protein settles into its final 
shape does it become active. This process is 
complete almost immediately after proteins are 
made. Most proteins fold in less than a second, 
although the largest and most complex proteins 
may require several seconds to fold. Some proteins 
need help from other proteins, called “chaperones,” 
to fold efficiently. 



Amino acids shown: glycine, asparagine, phenylalanine, methionine.

Amino acids are like differently shaped "beads" that make up protein "necklaces."
Shown here are a few examples of the 20 standard amino acids. Each amino acid
contains an identical backbone structure (in black) and a unique side chain, also
called an R-group (in red box). The shapes and chemical properties of these side
chains are responsible for the twists and folds of the protein as well as for the
protein's biological function.



The Structures of Life



Because proteins have diverse roles in the body, 
they come in many shapes and sizes. 

Studies of these shapes teach us how the proteins 
function in our bodies and help us understand 
diseases caused by abnormal proteins. 





Troponin C triggers muscle contraction by changing
shape. The protein grabs calcium in each of its
"fists," then "punches" other proteins to initiate
the contraction.

Collagen in our cartilage and tendons
gains its strength from its three-stranded,
rope-like structure.
Some proteins latch onto and regulate the activity 
of our genetic material, DNA. Some of these 
proteins are donut shaped, enabling them to form 
a complete ring around the DNA. Shown here is 
DNA polymerase III, which cinches around DNA 
and moves along the strands as it copies the 
genetic material. 





Many proteins, like the digestive enzyme 
chymotrypsin, are somewhat spherical in shape. 
Enzymes, which are proteins that facilitate 
chemical reactions, often contain a groove or 
pocket to hold the molecule they act upon. 



The examples here are schematic drawings
based on protein shapes that have been
determined experimentally. When scientists
decipher protein structures, they deposit the
three-dimensional coordinates into the
Protein Data Bank, currently available at
http://www.rcsb.org/pdb/.

Antibodies are immune system proteins
that rid the body of foreign material,
including bacteria and viruses. The two
arms of the Y-shaped antibody bind to
a foreign molecule. The stem of the
antibody sends signals to recruit other
members of the immune system.
Small Errors in Proteins Can Cause Disease 



Sometimes, an error in just one amino acid can 
cause disease. Sickle cell disease, which most 
often affects those of African descent, is caused 
by a single error in the gene for hemoglobin, 
the oxygen-carrying protein in red blood cells. 

This error, or mutation, results in an incorrect
amino acid at one position in the molecule.




The most common symptom of the disease 
is unpredictable pain in any body organ or joint, 
caused when the distorted blood cells jam together, 
unable to pass through small blood vessels. These 
blockages prevent oxygen-carrying blood from 
getting to organs and tissues. The frequency, 
duration, and severity of this pain vary greatly 
between individuals. 



The disease affects about 1 in every 500 African 
Americans, and 1 in 12 carry the trait and can pass 
it on to their children, but do not have the disease 
themselves. 

Another disease caused by a defect in one 
amino acid is cystic fibrosis. This disease is most 
common in those of northern European descent, 
affecting about 1 in 9,000 Caucasians in the United 
States. Another 1 in 20 are carriers. 

The disease is caused when a protein called 
CFTR is incorrectly folded. This misfolding is 
usually caused by the deletion of a single amino 
acid in CFTR. The function of CFTR, which stands 
for cystic fibrosis transmembrane conductance 
regulator, is to allow chloride ions (a component 
of table salt) to pass through the outer membranes 
of cells. 

When this function is disrupted in cystic fibrosis, 
glands that produce sweat and mucus are most 
affected. A thick, sticky mucus builds up in the 
lungs and digestive organs, causing malnutrition, 
poor growth, frequent respiratory infections, 
and difficulties breathing. Those with the disorder 
usually die from lung disease around the age of 30. 



Proteins Are the Body's Worker Molecules



Proteins Fold Into Spirals and Sheets

When proteins fold, they don’t randomly wad up 
into twisted masses. Often, short sections of proteins 
form recognizable shapes such as “alpha helices” 
or “beta sheets.” Alpha helices are spiral shaped 
and beta sheets are pleated structures. Scientists
devised a stylized method of representing proteins,
called a ribbon diagram, that highlights helices
and sheets. These organized sections of a protein
pack together with each other (or with other, less
organized sections) to form the final,
folded protein.



Proteins are made of amino
acids hooked end-to-end like
beads on a necklace.

To become active, proteins
must twist and fold into their
final, or "native," conformation.

This final shape enables proteins
to accomplish their function in
your body.
The Problem of Protein Folding

A given sequence of amino acids almost always folds 
into a characteristic, three-dimensional structure. 
So scientists reason that the instructions for folding 
a protein must be encoded within the sequence. 
Researchers can easily determine a protein’s amino
acid sequence. But for 50 years they’ve tried, and
failed, to crack the code that governs folding.



Scientists call this the “protein folding problem,”
and it remains one of the great challenges in
structural biology. Although researchers have
teased out some general rules and, in some cases,
can make rough guesses of a protein’s shape, they
cannot accurately and reliably predict a final
structure from an amino acid sequence.

The medical incentives for cracking the folding
code are great. Several diseases, including
Alzheimer’s, cystic fibrosis, and “mad cow”
disease, are thought to result from misfolded
proteins. Many scientists believe that if we could
decipher the structures of proteins from their
sequences, we could improve the treatment of
these diseases.

“If we could decipher the structures of proteins
from their sequences, we could better understand
all sorts of biological phenomena, from cancer to AIDS.
Then we might be able to do more about
these disorders.”

James Cassatt
Director, Division of Cell Biology and Biophysics
National Institute of General Medical Sciences
Provocative Proteins

• There are about 100,000 different proteins
in your body.

• Spider webs and silk fibers are made of the
strong, pliable protein fibroin. Spider
silk is stronger than a steel rod
of the same diameter, yet it is
much more elastic, so scientists
hope to use it for products as diverse as
bulletproof vests and artificial joints. The
difficult part is harvesting the silk, because
spiders are much less cooperative than silkworms!

• The light of fireflies (also called lightning bugs)
is made possible by a
protein called luciferase.
Although most predators
stay away from the bitter-tasting
insects, some frogs
eat so many fireflies that they glow!

• The deadly venoms of cobras, scorpions,
and puffer fish contain small proteins that act
as nerve toxins. Some sea snails stun their
prey (and occasionally, unlucky humans) with
up to 50 such toxins. Incredibly,
scientists are looking into
harnessing these toxins to
relieve pain that is unresponsive
even to morphine.

• Sometimes ships in the northwest
Pacific Ocean leave a trail
of eerie green light. The light
is produced by a protein in
jellyfish when the creatures
are jostled by ships. Because the
trail traces the path of ships at
night, this green fluorescent
protein has interested the Navy
for many years. Many cell biologists also use it
to fluorescently mark the cellular components
they are studying.

• If a recipe calls for rhino horn, ibis feathers,
and porcupine quills, try substituting your
own hair or fingernails. It’s all the same
stuff: alpha-keratin,
a tough, water-resistant
protein that is also the
main component of wool,
scales, hooves, tortoise shells,
and the outer layer of your skin.
High-Tech Tinkertoys®



Decades ago, scientists who wanted to study a molecule’s
three-dimensional structure would have to
build a large Tinkertoy®-type model out of rods,
balls, and wire scaffolding. The process was laborious
and clumsy, and the models often fell apart.

Today, researchers use computer graphics to
display and manipulate molecules. They can even
see how molecules might interact with one another.

In order to study different aspects of a molecule’s
structure, scientists view the molecule in several
ways. Below you can see one protein shown in three
different styles.

You can try one of these computer graphics programs
yourself at http://www.proteinexplorer.org.





Ribbon diagrams highlight organized
regions of the proteins. Alpha helices
(red) appear as spiral ribbons. Beta sheets
(aqua) are shown as flat ribbons.
Less organized areas appear as round
wires or tubes.

Space-filling molecular models attempt
to show atoms as spheres whose size
correlates with the amount of space the
atoms occupy. For consistency, the same
atoms are colored red and aqua in this
model and in the ribbon diagram.

A surface rendering of the protein shows
its overall shape and surface properties.
The red and blue coloration indicates the
electrical charge of atoms on the protein's
surface.
Richard T. Nowitz 
Structural Genomics: From Gene to
Structure, and Perhaps Function

The potential value of cracking the protein folding 
code increases daily as the Human Genome Project 
amasses vast quantities of genetic sequence information.
This government project was established
to obtain the entire genetic sequence of humans 
and other organisms. From these complete genetic 
sequences, scientists can easily obtain the amino 
acid sequences of all of an organism’s proteins by 
using the “genetic code.” 

The ultimate dream of many structural biologists 
is to determine directly from these sequences not 
only the three-dimensional structure, but also 
some aspects of the function, of all proteins. This 
vision has spurred a new field called structural 
genomics and a collaborative, international effort. 

Groups of scientists have begun to categorize all 
known proteins into families, based on their amino 
acid sequences and a prediction of their rough, 
overall structure. Just as some people can be recognized
as members of a family because they share a
certain feature, such as a cleft chin or
long nose, members of a protein family share
structural characteristics, based on similarities in
their amino acid sequences.

Researchers plan to determine the detailed, 
three-dimensional structures of one or more 
representative proteins from each of the families. 
They estimate that the total number of such 
representative structures will be at least 10,000. 



Using these 10,000 or so structures as
a guide, researchers expect to be able to
use computers to model the structures of
any other protein.

Although the detailed, three-dimensional structure
of a protein is extremely valuable to show scientists
what the molecule looks like and how it interacts
with other molecules, it is really only a "snapshot"
of the protein frozen in time and space.
Proteins are not rigid, static objects. They
are dynamic, rapidly changing molecules
that move, bend, expand, and contract.
Scientists are using complex programs
on ultra-high-speed computers to predict
and study protein movement.

Scientists learn much from comparing 
the structures of different proteins. Usually — 
but not always — two similarly shaped proteins have 
similar biological functions. By studying 
thousands of molecules in an organized way 
in this project, researchers will deepen their 
understanding of the relationships between gene 
sequence, protein structure, and protein function. 

In addition to any future medical or industrial
applications, researchers expect that by studying
the structure of all proteins from a single organism,
or proteins from different organisms that
serve the same physiological function, they will
learn fundamental lessons about biology.
The Genetic Code 



In addition to the protein folding code, which 
remains unbroken, there is another code, a genetic 
code, that scientists cracked in the mid-1960s. 
The genetic code reveals how gene sequences 
correspond to amino acid sequences. 



Genes are made of DNA (deoxyribonucleic 
acid), which itself is composed of small molecules 
called nucleotides connected together in long 
chains. A run of three nucleotides (called a triplet)
encodes one amino acid.






Nucleotides

Genes are made up of small molecules called nucleotides.
There are four different nucleotides in DNA, named for the
fundamental unit, or "base," they contain: adenine (A),
thymine (T), cytosine (C), and guanine (G). Thymine
was first isolated from thymus glands, and guanine was
first isolated from guano (bird feces).

Gene

Genes contain any number and combination of these
nucleotides. Three adjacent nucleotides in a gene code for
one amino acid.

Amino Acids

Through biochemical processes called transcription
and translation, cells make proteins from these
coded genetic messages.

Newly synthesized proteins fold into
their final shape.
UUU phenylalanine       UCU serine      UAU tyrosine        UGU cysteine
UUC phenylalanine       UCC serine      UAC tyrosine        UGC cysteine
UUA leucine             UCA serine      UAA stop            UGA stop
UUG leucine             UCG serine      UAG stop            UGG tryptophan

CUU leucine             CCU proline     CAU histidine       CGU arginine
CUC leucine             CCC proline     CAC histidine       CGC arginine
CUA leucine             CCA proline     CAA glutamine       CGA arginine
CUG leucine             CCG proline     CAG glutamine       CGG arginine

AUU isoleucine          ACU threonine   AAU asparagine      AGU serine
AUC isoleucine          ACC threonine   AAC asparagine      AGC serine
AUA isoleucine          ACA threonine   AAA lysine          AGA arginine
AUG methionine (start)  ACG threonine   AAG lysine          AGG arginine

GUU valine              GCU alanine     GAU aspartic acid   GGU glycine
GUC valine              GCC alanine     GAC aspartic acid   GGC glycine
GUA valine              GCA alanine     GAA glutamic acid   GGA glycine
GUG valine              GCG alanine     GAG glutamic acid   GGG glycine




The genetic code explains how sets of three 
nucleotides code for amino acids. This code is 
stored in DNA, then transferred to messenger RNA 
(mRNA), from which new proteins are synthesized. 
RNA (ribonucleic acid) is chemically very similar to 
DNA and also contains four chemical letters. But 
there is one major difference: where DNA uses 
thymine (T), mRNA uses uracil (U). 

The table above reveals all possible messenger 
RNA triplets and the amino acids they specify. For 
example, the mRNA triplet UUU codes for the amino 
acid phenylalanine. Note that most amino acids may 
be encoded by more than one mRNA triplet. 
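
The way a cell reads this table can be sketched in a few lines of code. The example below is a simplified illustration, not a model of the real cellular machinery: it assumes the DNA is given as the coding strand (so transcription is just a swap of T for U), it uses one-letter amino acid abbreviations with "*" for stop, and the function names and the short DNA sequence are invented for this example.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the order UUU, UUC, UUA, UUG, UCU, ...
# (third letter changes fastest); "*" marks a stop triplet.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Turn a DNA coding-strand sequence into mRNA by swapping T for U."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Read triplets from the first AUG until a stop triplet or the end of the message."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGTTTGGCTAA")  # made-up gene fragment
print(mrna, translate(mrna))       # AUGUUUGGCUAA MFG

# How many different triplets spell each amino acid?
triplets_per_amino_acid = Counter(CODON_TABLE.values())
print(triplets_per_amino_acid["L"], triplets_per_amino_acid["W"])  # 6 1
```

The message reads methionine (M), phenylalanine (F), glycine (G), then stops. The final count shows the redundancy noted above: six triplets code for leucine, while tryptophan has only one.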



Some proteins are synthesized at a 
constant rate, while others are made 
only in response to the body's need. 
What is a protein? 

Name three proteins 
in your body and describe 
what they do. 

What is meant by the 
detailed, three-dimensional 
structure of proteins? 

What do we learn from 
studying the structures 
of proteins? 

Describe the protein 
folding problem. 




CHAPTER 2
X-Ray Crystallography: Art Marries Science

How would you examine the shape of something
too small to see in even the most
powerful microscope? Scientists trying to visualize 
the complex arrangement of atoms within molecules 
have exactly that problem, so they solve it indirectly. 

By using a large collection of identical molecules — 
often proteins — along with specialized equipment 
and computer modeling techniques, scientists are 
able to calculate what an isolated molecule would 
look like. 

The two most common methods used to 
investigate molecular structures are X-ray 
crystallography (also called X-ray diffraction) and 
nuclear magnetic resonance (NMR) spectroscopy. 
Researchers using X-ray crystallography grow solid 
crystals of the molecules they study. Those using 
NMR study molecules in solution. Each technique 
has advantages and disadvantages. Together, they 
provide researchers with a precious glimpse into the 
structures of life. 



About 80 percent of the protein structures that 
are known have been determined using X-ray 
crystallography. In essence, crystallographers aim 
high-powered X-rays at a tiny crystal containing 
trillions of identical molecules. The crystal scatters 
the X-rays onto an electronic detector like a disco 
ball spraying light across a dance floor. The electronic
detector is the same type used to capture
images in a digital camera. 

After each blast of X-rays, lasting from a fraction 
of a second to several hours, the researchers 
precisely rotate the crystal by entering its desired 
orientation into the computer that controls the 
X-ray apparatus. This enables the scientists to 
capture in three dimensions how the crystal 
scatters, or diffracts, X-rays. 








Figure labels: X-ray beam, crystal, scattered X-rays, detector.
The First X-Ray Structure: Myoglobin

The first time researchers glimpsed the complex
internal structure of a protein was in 1959, when
John Kendrew, working at Cambridge University,
determined the structure of myoglobin using
X-ray crystallography.

Myoglobin, a molecule similar to but smaller
than hemoglobin, stores oxygen in muscle tissue.
It is particularly abundant in the muscles of diving
mammals such as whales, seals, and dolphins,
which need extra supplies of oxygen to remain
submerged for long periods of time. In fact, it is
up to nine times more abundant in the muscles
of these sea mammals than it is in the muscles
of land animals.

The intensity of each diffracted ray is fed into
a computer, which uses a mathematical equation
called a Fourier transform to calculate the position
of every atom in the crystallized molecule.

The result, the researchers’ masterpiece, is
a three-dimensional digital image of the molecule.
This image represents the physical and chemical
properties of the substance and can be studied in
intimate, atom-by-atom detail using sophisticated
computer graphics software.
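
The Fourier transform step, in which a computer turns diffracted rays into atom positions, can be illustrated with a toy "crystal" that has only one dimension and two atoms. This is a sketch under strong assumptions, not how real crystallography software works: the atom positions and weights are invented, and the code keeps the full complex value of each diffracted ray. In a real experiment the detector records only intensities, and the missing phase information has to be recovered in a separate step.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """Forward transform: one complex number F(h) for each diffracted ray h."""
    return {
        h: sum(weight * cmath.exp(2j * math.pi * h * x) for x, weight in atoms)
        for h in range(-h_max, h_max + 1)
    }


def density(factors, x):
    """Inverse transform: rebuild the density at fractional position x in the cell."""
    total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
    return total.real / len(factors)


# Hypothetical one-dimensional cell: (fractional position, weight) for each atom.
atoms = [(0.20, 6.0), (0.55, 8.0)]
factors = structure_factors(atoms, h_max=20)

# Sample the rebuilt density at 100 points across the repeating cell.
profile = [density(factors, i / 100) for i in range(100)]
threshold = 0.5 * max(profile)

# Atoms show up as local maxima that stand well above the background ripple.
peaks = [
    i for i in range(100)
    if profile[i] > threshold
    and profile[i] > profile[i - 1]
    and profile[i] > profile[(i + 1) % 100]
]
print([p / 100 for p in peaks])
```

The peaks found in the rebuilt density sit at the two positions where the atoms were placed, which is the essence of calculating a structure from diffraction data.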



An essential step in X-ray crystallography is
growing high-quality crystals. The best crystals
are pure, perfectly symmetrical, three-dimensional
repeating arrays of precisely packed molecules.
They can be different shapes, from perfect cubes
to long needles. Most crystals used for these
studies are barely visible (less than 1 millimeter
on a side). But the larger the crystal, the more
accurate the data and the more easily scientists
can solve the structure.

Crystallographers
grow their tiny crystals
in plastic dishes. They
usually start with a
highly concentrated
solution containing the
molecule. They then
mix this solution with
a variety of specially
prepared liquids to
form tiny droplets
(1-10 microliters).
Each droplet is kept in a
separate plastic dish or
well. As the liquid evaporates, the molecules in the
solution become progressively more concentrated.
During this process, the molecules arrange into
a precise, three-dimensional pattern and eventually
into a crystal, if the researcher is lucky.



Sometimes, crystals require months or even 
years to grow. The conditions — temperature, pH 
(acidity or alkalinity), and concentration — must 
be perfect. And each type of molecule is different, 
requiring scientists to tease out new crystallization 
conditions for every new sample. 

Even then, some molecules just won't cooperate. 
They may have floppy sections that wriggle around 
too much to be arranged neatly into a crystal. Or, 
particularly in the case of proteins that are normally 
embedded in oily cell membranes, the molecule 
may fail to completely dissolve in the solution. 
Some crystallographers keep their growing 
crystals in air-locked chambers, to prevent any 
misdirected breath from disrupting the tiny crystals. 
Others insist on an environment free of vibrations — 
in at least one case, from rock-and-roll music. 

Still others joke about the phases of the moon and 
supernatural phenomena. As the jesting suggests, 
growing crystals remains the most difficult and least 
predictable part of X-ray crystallography. It’s what
blends art with the science.



Calling All Crystals 



Although the crystals used in X-ray
crystallography are barely
visible to the naked
eye, they contain
a vast number of precisely
ordered, identical molecules. A
crystal that is 0.5 millimeters on each side
contains around 1,000,000,000,000,000 (or 10^15)
medium-sized protein molecules.
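
A quick calculation shows that this number is plausible. The sketch below divides the volume of the crystal evenly among the molecules and asks how big a box each one gets. It assumes the crystal is a perfect cube filled only with protein, ignoring the water and other solvent that real protein crystals also contain, and the function name is invented for this example.

```python
ANGSTROMS_PER_MILLIMETER = 1e7  # 1 mm = 10^-3 m and 1 angstrom = 10^-10 m

crystal_side_mm = 0.5
molecules = 1e15


def space_per_molecule(side_mm, count):
    """Return (volume in cubic angstroms, edge of an equal-volume cube in angstroms)."""
    side = side_mm * ANGSTROMS_PER_MILLIMETER
    volume = side ** 3 / count
    return volume, volume ** (1 / 3)


volume, edge = space_per_molecule(crystal_side_mm, molecules)
print(f"about {volume:.3g} cubic angstroms each, a cube roughly {edge:.0f} angstroms wide")
```

Each molecule ends up with a box about 50 angstroms on a side, a few dozen atoms across, which is a reasonable amount of room for a medium-sized protein.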

When the crystals are fully formed, they are 
placed in a tiny glass tube or scooped up with a 
loop made of nylon, human hair, or other material 
depending on the preference of the researcher. 

The tube or loop is then mounted in the X-ray 
apparatus, directly in the path of the X-ray beam. 

The searing force of powerful X-ray beams can 
burn holes through a crystal left too long in their 
path. To minimize radiation damage, researchers 
flash-freeze their crystals in liquid nitrogen. 



Crystal photos courtesy of Alex McPherson,
University of California, Irvine
STUDENT SNAPSHOT 



Science Brought One Student From the Coast 
of Venezuela to the Heart of Texas 




“Science is like a roller
coaster. You start out
very excited about what you’re 
doing. But if your experiments 
don’t go well for a while, you 
get discouraged. Then, out of 
nowhere, comes this great data 
and you are up and at it again.” 

That’s how Juan Chang 
describes the nature of science. 

He majored in biochemistry 
and computer science at the 
University of Texas at Austin. 

He also worked in the UT- 
Austin laboratory of X-ray 
crystallographer Jon Robertus. 

Chang studied a protein
that prevents cells from committing suicide. As a
sculptor chips and shaves off pieces of marble, the
body uses cellular suicide, also called “apoptosis,”
during normal development to shape features like
fingers and toes. To protect healthy cells, the body
also triggers apoptosis to kill cells that are genetically
damaged or infected by viruses.

By understanding proteins involved in causing
or preventing apoptosis, scientists hope to control
the process in special situations: to help treat
tumors and viral infections by promoting the
death of damaged cells, and to treat degenerative
nerve diseases by preventing apoptosis in nerve
cells. A better understanding of apoptosis may
even allow researchers to more easily grow tissues
for organ transplants.

Chang was part of this process by helping to
determine the X-ray crystal structure of his protein,
which scientists refer to as ch-IAP1. He used
biochemical techniques to obtain larger quantities
of his purified protein. The next step will be to
crystallize the protein, then to use X-ray diffraction
to obtain its detailed, three-dimensional structure.

“Science is like a roller coaster. You start out very excited
about what you’re doing. But if your experiments
don’t go well for a while, you get discouraged.
Then, out of nowhere, comes this great data
and you are up and at it again.”

Juan Chang
Graduate Student
Baylor College of Medicine

Chang came to Texas from a lakeside town 
on the northwest tip of Venezuela. He first became 
interested in biological science in high school. 
His class took a field trip to an island off the 
Venezuelan coast to observe the intricate ecological 
balance of the beach and coral reef. He was 
impressed at how the plants and animals — crabs, 
insects, birds, rodents, and seaweed — each 
adapted to the oceanside wind, waves, and salt. 

About the same time, his school held a fund 
drive to help victims of Huntington’s disease, an 
incurable genetic disease that slowly robs people 
of their ability to move and think properly. 



The town in which Chang grew up, Maracaibo, is 
home to the largest known family with Huntington’s
disease. Through the fund drive, Chang became 
interested in the genetic basis of inherited diseases. 

His advice for anyone considering a career 
in science is to “get your hands into it” and to 
experiment with work in different fields. He was 
initially interested in genetics, did biochemistry 
research, and is now in a graduate program at 
Baylor College of Medicine. The program combines 
structural and computational biology with molec- 
ular biophysics. He anticipates that after earning 
a Ph.D., he will become a professor at a university. 
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Why X-Rays? 

In order to measure something accurately, you 
need the appropriate ruler. To measure the distance 
between cities, you would use miles or kilometers. 
To measure the length of your hand, you would use 
inches or centimeters. 

Crystallographers measure the distances 
between atoms in angstroms. One angstrom equals 
one ten-billionth of a meter, or 10⁻¹⁰ m. That’s
more than 10 million times smaller than
the diameter of the period at the end of this sentence.

The perfect “rulers” to measure angstrom 
distances are X-rays. The type of X-rays used 
by crystallographers are approximately 0.5 to 
1.5 angstroms long — just the right size to measure 
the distance between atoms in a molecule. There 
is no better place to generate such X-rays than 
in a synchrotron. 
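
To make the "ruler" idea concrete, here is a minimal sketch that converts the quoted X-ray wavelengths from angstroms to meters and to photon energy. The relation E = hc/wavelength is standard physics that this booklet does not state, and the function and constant names are invented for illustration only.

```python
# Illustrative sketch: convert an X-ray wavelength in angstroms to meters
# and to photon energy. The relation E = h * c / wavelength is an assumption
# brought in from standard physics; the names below are hypothetical.

PLANCK_J_S = 6.62607015e-34        # Planck constant, joule-seconds
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light, meters per second
JOULES_PER_EV = 1.602176634e-19    # joules in one electronvolt
METERS_PER_ANGSTROM = 1e-10        # one ten-billionth of a meter


def angstroms_to_meters(length_angstrom: float) -> float:
    """Express a length given in angstroms in meters."""
    return length_angstrom * METERS_PER_ANGSTROM


def photon_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in kiloelectronvolts for a given wavelength."""
    wavelength_m = angstroms_to_meters(wavelength_angstrom)
    energy_joules = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return energy_joules / JOULES_PER_EV / 1000.0


# The range of wavelengths the text says crystallographers use.
for wavelength in (0.5, 1.0, 1.5):
    print(f"{wavelength} angstrom -> {photon_energy_kev(wavelength):.1f} keV")
```

Shorter wavelengths carry more energy, so the 0.5 angstrom end of the range is the most energetic.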
Synchrotron Radiation — One of the
Brightest Lights on Earth

Imagine a beam of light 30 times more powerful
than the Sun, focused on a spot smaller than the
head of a pin. It carries the blasting power of a
meteor plunging through the atmosphere. And
it is the single most powerful tool available to
X-ray crystallographers.



This light, one of the brightest lights on earth, 
is not visible to our eyes. It is made of X-ray 
beams generated in large machines called 
synchrotrons. These machines accelerate electrically 
charged particles, often electrons, to nearly the 
speed of light, then whip them around a huge, 
hollow metal ring. 
When using light to measure an
object, the wavelength of the light
needs to be similar to the size of the
object. X-rays, with wavelengths of
approximately 0.5 to 1.5 angstroms,
can measure the distance between 
atoms. Visible light, with a wave- 
length of 4,000 to 7,000 angstroms, 
is used in ordinary light microscopes 
because it can measure objects the 
size of cellular components. 
The Advanced Photon Source (APS) at Argonne National Laboratory near Chicago
is a "third-generation" synchrotron radiation facility. Biologists were considered 
parasitic users on the "first-generation" synchrotrons, which were built for 
physicists studying subatomic particles. Now, many synchrotrons, such as the 
APS, are designed specifically to optimize X-ray production and support the 
research of scientists in a variety of fields, including biology. 



Synchrotrons were originally designed for 
use by high-energy physicists studying subatomic 
particles and cosmic phenomena. Other scientists 
soon clustered at the facilities to snatch what the 
physicists considered an undesirable byproduct — 
brilliant bursts of X-rays. 

The largest component of each synchrotron 
is its electron storage ring. This ring is actually 
not a perfect circle, but a many-sided polygon. 

At each corner of the polygon, precisely aligned 
magnets bend the electron stream, forcing it to stay 
in the ring (on their own, the particles would travel 
straight ahead and smash into the ring’s wall). 
Each time the electrons’ path is bent, 
they emit bursts of energy in the form of 
electromagnetic radiation. 

This phenomenon is not unique to electrons or 
to synchrotrons. Whenever any charged particle 
changes speed or direction, it emits energy. The 
type of energy, or radiation, that particles emit 
depends on the speed the particles are going and 
how sharply they are bent. Because particles in 
a synchrotron are hurtling at nearly the speed 
of light, they emit intense radiation, including 
lots of high-energy X-rays. 
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Peering Into Protein Factories 



Ribosomes make the stuff of life. They are the 
protein factories in every living creature, and they 
churn out all proteins ranging from bacterial toxins 
to human digestive enzymes. 

To most people, ribosomes are extremely 
small — tens of thousands of ribosomes would 
fit on the sharpened tip of a pencil. But to a 
structural biologist, ribosomes are huge. They 
contain three or four strands of RNA and more than 
50 small proteins. These many components work 
together like moving parts in a complex machine — a 
machine so large that it has been impossible to study 
in structural detail until recently.

In 1999, researchers determined the crystal 
structure of a complete ribosome for the first time. 
This snapshot, although it was not detailed enough 
to reveal the location of individual atoms, did show 
how various parts of the ribosome fit together and 
where within a ribosome new proteins are made. 

As increasingly detailed ribosome structures become 
available, they will show, at an atomic level, how 
proteins are made. 

In addition to providing valuable insights into 
a critical cellular component and process, structural 
studies of ribosomes may lead to clinical applications. 
Many of today’s antibiotics work by interfering
with the function of ribosomes in harmful bacteria 
while leaving human ribosomes alone. A more 
detailed knowledge of the structural differences 
between bacterial and human ribosomes may help 
scientists develop new antibiotic drugs or improve 
existing ones. 




The first structural snapshot of an entire bacterial
ribosome. The structure, which is the largest deter- 
mined by X-ray crystallography to date, will help 
researchers better understand the fundamental 
process of protein production. It may also aid 
efforts to design new antibiotic drugs or optimize 
existing ones. 

Ribosome structure courtesy of Jamie Cate, Marat Yusupov,
Gulnara Yusupova, Thomas Earnest, and Harry Noller. Graphic
courtesy of Albion Baucom, University of California, Santa Cruz.



The work was also a technical triumph for 
crystallography. The ribosome was much larger 
than any other irregular structure previously 
determined. (Some equally large virus structures 
have been obtained, but the symmetry of these 
structures greatly simplified the process.) Now that 
the technique has been worked out, researchers 
are obtaining increasingly detailed pictures of the 
ribosome — ones in which they can pinpoint 
every atom. 
Scientists Get MAD at the Synchrotron
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Synchrotrons are prized not only for their ability to 
generate brilliant X-rays, but also for the 
“tunability” of these rays. Scientists can actually
select from these rays just the right wavelength for 
their experiments. 

In order to determine the structure of a mole- 
cule, crystallographers usually have to compare 
several versions of a crystal — one pure crystal 
and several others in which the crystallized mole- 
cule is soaked in, or “doped” with, a different heavy 
metal, like mercury, platinum, or uranium. 



Because these heavy metal atoms contain many 
electrons, they scatter X-rays more than do the 
smaller, lighter atoms found in biological molecules. 
By comparing the X-ray scatter patterns of a pure
crystal with those of various metal-containing
crystals, the researchers can determine the location
of the metals in the crystal. These metal atoms serve as
landmarks that enable researchers to calculate the
position of every other atom in the molecule.






There are half a dozen major synchrotrons used for X-ray crystallography
in the United States. 



But when using X-ray radiation from the syn- 
chrotron, researchers do not have to grow multiple 
versions of every crystallized molecule — a huge 
savings in time and money. Instead, they grow only 
one type of crystal which contains the chemical 
element selenium instead of sulfur in every methio- 
nine amino acid. They then “tune” the wavelength 
of the synchrotron beam to match certain properties 
of selenium. That way, a single crystal serves the 
purpose of several different metal-containing 
crystals. This technique is called MAD, for Multi- 
wavelength Anomalous Diffraction. 

Using MAD, the researchers bombard the
selenium-containing crystals three or four different
times, each time with X-ray beams of a different
wavelength — including one blast with X-rays of the
exact wavelength absorbed by the selenium atoms.
A comparison of the resulting diffraction patterns enables
researchers to locate the selenium atoms, which
again serve as markers, or reference points, around
which the rest of the structure is calculated.

The brilliant X-rays from synchrotrons allow 
researchers to collect their raw data much more 
quickly than when they use traditional X-ray
sources, which are small enough to fit on a long
laboratory table and produce much weaker 
X-rays than do synchrotrons. What used to take 
weeks or months in the laboratory can be done 
in minutes at a synchrotron. But then the data 
still must be analyzed by computers and the sci- 
entists, refined, and corrected before the protein 
can be visualized in its three-dimensional 
structural splendor. 

The number and quality of molecular struc- 
tures determined by X-ray diffraction has risen 
sharply in recent years, as has the percentage of 
these structures obtained using synchrotrons. 
This trend promises to continue, due in large 
part to new techniques like MAD and to the 
matchless power of synchrotron radiation. 

In addition to revealing the 
atomic architecture of biological 
molecules, synchrotrons are used by 
the electronics industry to develop new 
computer chips, by the petroleum industry 
to develop new catalysts for refining crude oil 
and to make byproducts like plastics, and in 
medicine to study progressive bone loss. 

Crystal photos courtesy of Alex McPherson, 

University of California, Irvine 





What is X-ray 
crystallography? 



Give two reasons 
why synchrotrons are 
so valuable to X-ray 
crystallographers. 



What is a ribosome 
and why is it important 
to study? 
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CHAPTER 3 









“Most atoms in biological molecules have
a little magnet inside them. If we put any 
of these molecules in a big magnet, all the little 
magnets in the molecule will orient themselves 
to line up with the big magnet,” allowing scientists 
to probe various properties of the molecule. That’s 
how Angela Gronenborn describes the technique 
of nuclear magnetic resonance spectroscopy, 
or NMR. Gronenborn is a researcher at the 
National Institutes of Health who uses NMR 
to determine the structure of proteins involved 
in HIV infection, in the immune response, and 
in “turning on” genes. 




Next to X-ray diffraction, NMR is the most 
common technique used to determine detailed 
molecular structures. This technique, which has 
nothing to do with nuclear reactors or nuclear 
bombs, is based on the same principle as the 
magnetic resonance imaging (MRI) machines that 
allow doctors to see tissues and organs such as the 
brain, heart, and kidneys. 

Although NMR is used for a variety of medical 
and scientific purposes — including determining 
the structure of genetic material (DNA and RNA), 
carbohydrates, and other molecules — in this booklet 
we will focus on using NMR to determine the 
structure of proteins. 



Currently, NMR spectroscopy is only able to determine 
the structures of small and medium-sized proteins. 
Shown here is the largest structure determined by 
X-ray crystallography (the ribosome) compared to 
one of the largest structures determined by NMR 
spectroscopy. 

Ribosome structure courtesy of Jamie Cate, Marat Yusupov,
Gulnara Yusupova, Thomas Earnest, and Harry Noller. Graphic
courtesy of Albion Baucom, University of California, Santa Cruz.
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Methods for determining structures by NMR 
spectroscopy are much younger than those that 
use X-ray crystallography. As such, they are 
constantly being refined and 
improved. “NMR structure deter- 
mination is still an evolving 
field,” says Gronenborn. “Yes, 
we’re 20 years behind X-ray 
crystallography, but it’s very 
exciting. There are new discoveries 
and techniques every year. This 
should be really interesting for 
young people going into science.” 

The most obvious area in which NMR lags 
behind X-ray crystallography is the size of the 
structures it can handle. The largest structures 
NMR spectroscopists have determined are 30 
to 40 kilodaltons (270 to 360 amino acids). X-ray 
crystallographers have solved rough structures 
of up to 2,500 kilodaltons — 60 times as large. 
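
The size figures above can be checked with a little arithmetic. The sketch below assumes an average mass of about 110 daltons per amino acid, a common rule of thumb that is not given in this booklet; the function name is made up for the example.

```python
# Rough sketch: estimate how many amino acids a protein of a given mass holds.
# ASSUMPTION: an average amino acid in a protein weighs about 110 daltons.
# This rule-of-thumb value is not taken from the booklet.

AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_residue_count(mass_kilodaltons: float) -> int:
    """Approximate number of amino acids in a protein of the given mass."""
    return round(mass_kilodaltons * 1000.0 / AVERAGE_RESIDUE_MASS_DA)


# The NMR size range quoted in the text, 30 to 40 kilodaltons.
for mass_kda in (30, 40):
    print(f"{mass_kda} kDa is roughly {approx_residue_count(mass_kda)} amino acids")

# Size ratio between the largest crystal structure and the largest NMR structure.
print(f"2,500 kDa is about {2500 / 40:.1f} times 40 kDa")
```

The estimate applies only to proteins. A 2,500-kilodalton ribosome also contains RNA, so its mass cannot be turned into an amino acid count this way.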







But NMR also has advantages over crystallog- 
raphy. For one, it uses molecules in solution, 
so it is not limited to those that crystallize well. 
(Remember that crystallization is often the most 
uncertain and time-consuming step in X-ray 
crystallography.) 

NMR also makes it fairly easy to study proper- 
ties of a molecule besides its structure — such 
as the flexibility of the molecule and how it interacts 
with other molecules. With crystallography, it 
is often either impossible to study these aspects 
or it requires an entirely new crystal. Using NMR 
and crystallography together gives researchers 
a more complete picture of a molecule and its 
functioning than either tool alone. 
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NMR relies on the interaction between an 
applied magnetic field and the natural “little 
magnets” in certain atomic nuclei. For protein 
structure determination, spectroscopists concentrate 
on the atoms that are most common in proteins, 
namely hydrogen, carbon, and nitrogen. 

Before the researchers begin to determine a 
protein’s structure, they already know its amino 
acid sequence — the names and order of all of its 
amino acid building blocks. What they seek to 
learn through NMR is how this chain of amino 
acids wraps and folds around itself to create the 
three-dimensional, active protein. 

Solving a protein structure using NMR is like 
a good piece of detective work. The researchers 
conduct a series of experiments, each of which 
provides partial clues about the nature of the 



atoms in the sample molecule — such as how close 
two atoms are to each other, whether these atoms 
are physically bonded to each other, or where the 
atoms lie within the same amino acid. Other 
experiments show links between adjacent amino 
acids or reveal flexible regions in the protein. 

The challenge of NMR is to employ several sets 
of such experiments to tease out properties unique 
to each atom in the sample. Using computer pro- 
grams, NMR spectroscopists can get a rough idea 
of the protein’s overall shape and can see possible
arrangements of atoms in its different parts. Each 
new set of experiments further refines these possible 
structures. Finally, the scientists carefully select 20 to 
40 solutions that best represent their experimental 
data and present the average of these solutions as 
their final structure. 
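
The last step, averaging a set of candidate solutions, can be sketched in a few lines. This is only an illustration: it assumes the models are already superimposed on one another and list the same atoms in the same order, and the coordinates are invented. Real structure-calculation software does far more than this.

```python
# Minimal sketch of averaging an ensemble of candidate structures.
# ASSUMPTIONS: every model lists the same atoms in the same order, the models
# are already superimposed, and the coordinates below are made-up values.
from statistics import mean

Coordinate = tuple[float, float, float]


def average_structure(models: list[list[Coordinate]]) -> list[Coordinate]:
    """Average each atom's x, y, z position over all models."""
    atom_count = len(models[0])
    if any(len(model) != atom_count for model in models):
        raise ValueError("all models must contain the same number of atoms")
    averaged = []
    for atom_index in range(atom_count):
        positions = [model[atom_index] for model in models]
        # zip(*positions) groups all x values, all y values, all z values.
        averaged.append(tuple(mean(axis) for axis in zip(*positions)))
    return averaged


# Three hypothetical two-atom models (coordinates in angstroms).
ensemble = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.7, 0.1, 0.0)],
    [(0.1, 0.3, 0.0), (1.6, 0.2, 0.0)],
]
for atom_position in average_structure(ensemble):
    print(atom_position)
```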
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Only certain forms, or isotopes, of each chemical 
element have the correct magnetic properties 
to be useful for NMR. Perhaps the most familiar 
isotope is ¹⁴C, which is used for archeological and
geological dating. 

You may also have heard about isotopes in the 
context of radioactivity. Neither of the isotopes 
most commonly used in NMR, namely ¹³C and ¹⁵N,
is radioactive. 





Like many other biological scientists, NMR 
spectroscopists (and X-ray crystallographers) use 
harmless laboratory bacteria to produce proteins 
for their studies. They insert into these bacteria 
the gene that codes for the protein under study. 
This forces the bacteria, which grow and multiply
in swirling flasks, to produce large amounts of 
tailor-made proteins. 
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To generate proteins that are "labeled" with the 
correct isotopes, NMR spectroscopists put their 
bacteria on a special diet. If the researchers 
want proteins labeled with ¹³C, for example, the
bacteria are fed food containing ¹³C. That way,
the isotope is incorporated into all the proteins 
produced by the bacteria. 
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NMR Magic Is in the Magnets 



The magnets used for NMR are incredibly strong. 

Most range in strength from 500 megahertz 
(11.7 tesla) to 800 megahertz (18.8 tesla). That's
hundreds of thousands of times stronger than the magnetic field
on Earth’s surface. Researchers are always eager for 
ever-stronger magnets because these give NMR 
more sensitivity and higher resolution. 
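
NMR magnets are described by a frequency in megahertz because that is the radio frequency at which hydrogen nuclei respond in the magnet's field. A minimal sketch of the conversion follows. It assumes the standard value of about 42.577 megahertz per tesla for hydrogen nuclei, which this booklet does not give, and the function names are hypothetical.

```python
# Sketch: convert between magnet strength in tesla and the hydrogen (proton)
# resonance frequency in megahertz.
# ASSUMPTION: protons resonate at about 42.577 MHz per tesla. This constant
# comes from standard physics references, not from the booklet.

PROTON_MHZ_PER_TESLA = 42.577


def proton_frequency_mhz(field_tesla: float) -> float:
    """Resonance frequency of hydrogen nuclei in a field of the given strength."""
    return PROTON_MHZ_PER_TESLA * field_tesla


def field_strength_tesla(frequency_mhz: float) -> float:
    """Field strength needed for hydrogen nuclei to resonate at this frequency."""
    return frequency_mhz / PROTON_MHZ_PER_TESLA


for tesla in (11.7, 18.8):
    print(f"{tesla} tesla -> {proton_frequency_mhz(tesla):.0f} megahertz")

print(f"900 megahertz -> {field_strength_tesla(900):.1f} tesla")
```

The results land close to the rounded 500 and 800 megahertz labels used in the text.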

While the sample is exposed to a strong magnetic 
field, outside most NMR magnets used in structure 
determination, the field is fairly weak. If you stand 
next to a very powerful NMR magnet, the most you 
may feel is a slight tug on hair clips or zippers. But 
do not bring your watch or wallet — NMR magnets 
are notorious for stopping analog watches and 
erasing the magnetic strips on credit cards. 

NMR magnets are superconductors, so they
must be cooled with liquid helium, which is kept at
4 Kelvin (-452 degrees Fahrenheit). Liquid nitrogen,
which is kept at 77 Kelvin (-321 degrees Fahrenheit),
helps keep the liquid helium cold.

Most spectroscopists use magnets that are 500 megahertz to 800 megahertz.
This magnet is 900 megahertz — the strongest one available.
The Many Dimensions of NMR

To begin a series of NMR experiments, researchers 
insert a slender glass tube containing about a half 
a milliliter of their sample into a powerful, specially 
designed magnet. The natural magnets in the 
sample’s atoms line up with the NMR magnet
just as iron filings line up with a toy magnet. 

The researchers then blast the sample with a series 
of split-second radio wave pulses that disrupt this 
magnetic equilibrium in the nuclei of selected atoms. 

By observing how these nuclei react to the radio 
waves, researchers can assess their chemical nature. 
Specifically, researchers measure a property of the 
atoms called chemical shift. 



Every type of NMR-active atom in the protein 
has a characteristic chemical shift. Over the years, 
NMR spectroscopists have discovered characteristic 
chemical shift values for different atoms (for 
example, the carbon in the center of an amino 
acid, or its neighboring nitrogen), but the exact 
values are unique in each protein. Chemical shift 
values depend on the local chemical environment 
of the atomic nucleus, such as the number and type 
of chemical bonds between neighboring atoms. 

The pattern of these chemical shifts is displayed 
as a series of peaks on a computer screen. This one- 
dimensional NMR spectrum usually contains clusters 
of overlapping peaks, making it nearly impossible 
for scientists to analyze the information it contains. 




This one-dimensional NMR spectrum shows the
chemical shifts of hydrogen atoms in a protein
from streptococcal bacteria. Each peak corresponds
to one or more hydrogen atoms in the molecule.

Spectrum courtesy of Ramon Campos-Olivas, National Institutes of Health 
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The higher the peak, the more hydrogen atoms it 
represents. The position of the peaks on the horizontal 
axis shows how much energy is required to align 
those hydrogens with the magnetic field. 
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To determine protein structures, NMR spectros- 
copists use a technique called multi-dimensional 
NMR. This technique combines several sets of 
experiments, which spreads out the data into 
discrete spots. The location of each spot indicates 
unique properties of one atom in the sample. 
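
A toy example may help show why a second dimension separates signals that pile up in one. In this sketch every peak has a hydrogen chemical shift and a nitrogen chemical shift. The labels, the shift values, the parts-per-million units, and the tolerances are all invented for illustration and do not come from a real protein.

```python
# Toy sketch: peaks that overlap in one dimension can separate in two.
# ASSUMPTIONS: the peak labels, chemical shift values (in parts per million),
# and tolerances below are hypothetical, chosen only to illustrate the idea.
from itertools import combinations

# (label, hydrogen shift, nitrogen shift)
PEAKS = [
    ("A", 8.20, 118.0),
    ("B", 8.21, 124.5),
    ("C", 8.19, 110.3),
    ("D", 7.60, 118.1),
]


def overlapping_pairs(peaks, h_tol=0.05, n_tol=None):
    """Return label pairs that cannot be told apart.

    With n_tol=None only the hydrogen dimension is compared (a 1D spectrum).
    With n_tol set, peaks must also be close in the nitrogen dimension
    to count as overlapping (a 2D spectrum).
    """
    pairs = []
    for first, second in combinations(peaks, 2):
        close_in_h = abs(first[1] - second[1]) <= h_tol
        close_in_n = n_tol is None or abs(first[2] - second[2]) <= n_tol
        if close_in_h and close_in_n:
            pairs.append((first[0], second[0]))
    return pairs


print("1D overlaps:", overlapping_pairs(PEAKS))
print("2D overlaps:", overlapping_pairs(PEAKS, n_tol=0.5))
```

Peaks A, B, and C crowd together when only hydrogen is measured, but each gets its own spot once the nitrogen value is added.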

The researchers must then label each spot with 
the identity of the atom to which it corresponds. 

For a small to medium-sized protein, accurately 
assigning each spot to a particular atom in the protein 
molecule may take 3 to 6 months — even with some 
help from computers. For a large, complex protein, 
it could take up to a year. 
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Each NMR experiment is composed of hundreds 
of radio wave pulses, with each pulse up to 
a few milliseconds after the previous one. 
Scientists enter the experiment they'd like to run 
into a computer, which then precisely times the 
pulses it sends to the sample and collects the 
resulting data. 

This process can require as little as 20 minutes 
for a single, simple experiment. For a complex 
molecule, data collection could take weeks 
or months. 




To better understand multi-dimensional NMR,
we can think of an encyclopedia. If all the words
in the encyclopedia were condensed into one
dimension, the result would be a single, illegible
line of text blackened by countless overlapping letters.
Expand this line to two dimensions — a page — and
you still have a jumbled mess of superimposed
words. Only by expanding into multiple volumes
is it possible to read all the information in the
encyclopedia. In the same way, more complex
NMR studies require experiments in three or
four dimensions to clearly solve the problem.



NMR's radio wave pulses are quite tame 
compared to the high-energy X-rays used in 
crystallography. In fact, if an NMR sample is 
prepared well, it should be able to last "forever,"
says Gronenborn, allowing the researchers to 
conduct further studies on the same sample 
at a later time. 
Spectroscopists Get NOESY
for Structures

To determine the arrangement of the atoms in the 
molecule, the scientists use a multi-dimensional 
NMR technique called NOESY (pronounced “nosy”)
for Nuclear Overhauser Effect Spectroscopy. 

This technique works best on the nuclei of
hydrogen atoms, which have the strongest NMR
signal and are the most common atomic nuclei
in biological systems. They are also the simplest —
each hydrogen nucleus contains just a single proton.

The NOESY experiment reveals how close 
different protons are to each other in space. A pair 
of protons very close together (typically within 3 
angstroms) will give a very strong NOESY signal. 
More separated pairs of protons will give weaker 
signals, out to the limit of detection for the tech- 
nique, which is about 6 angstroms. 
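
The steep fall-off in signal can be put into numbers. The sketch below assumes the textbook approximation that the NOESY signal between two isolated protons falls with the sixth power of the distance between them. That relationship is not stated in this booklet, and real spectra are affected by other factors as well.

```python
# Sketch: how quickly a NOESY signal weakens with distance.
# ASSUMPTION: signal strength is proportional to 1 / distance**6, the usual
# textbook approximation for an isolated pair of protons. The function name
# and the 3-angstrom reference are chosen only for this illustration.


def relative_noe_strength(distance_angstrom: float,
                          reference_angstrom: float = 3.0) -> float:
    """Signal strength relative to a pair of protons at the reference distance."""
    return (reference_angstrom / distance_angstrom) ** 6


for distance in (3.0, 4.0, 5.0, 6.0):
    strength = relative_noe_strength(distance)
    print(f"{distance} angstroms -> {strength:.3f} of the 3-angstrom signal")
```

Under this assumption, doubling the distance from 3 to 6 angstroms cuts the signal to 1/64 of its strength, which is consistent with a detection limit of about 6 angstroms.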

From there, the scientists (or, to begin with, 
their computers) must determine how the atoms 
are arranged in space. It’s like solving a complex, 
three-dimensional puzzle with thousands of pieces. 



A Detailed Structure: Just the
Beginning

Although a detailed, three-dimensional structure 
of a protein is extremely valuable to show scientists 
what the molecule looks like, it is really only a static 
“snapshot” of the protein frozen in one position. 
Proteins themselves are not rigid or static — they 
are dynamic molecules that can partially unravel,
fold more tightly, or change shape in response to
their environment. Some proteins even remain 
partially unfolded until they bind to their biological 
target. NMR researchers can explore some of these 
internal molecular motions by altering the solvent 
used to dissolve the protein. 

A three-dimensional NMR structure often 
merely provides the framework for more in-depth 
studies. After you have the structure, you can easily 
probe features that reveal the molecule’s role 
and behavior in the body, including its flexibility, 
its interactions with other molecules, and how 
it reacts to changes in temperature, acidity, and 
other conditions. 



“I believe that structure is really a beginning and not
an end of studying a molecule,” said Gronenborn. 
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Untangling Protein Folding 



A hundred billion years — that’s the time scientists 
estimate it could take for a small protein to fold 
randomly into its active shape. But somehow, 
Nature does it in a tenth of a second. 

Understanding how proteins fold so 
quickly and correctly (most of the time) 
is more than just a scientific challenge. 

Dozens of diseases are known or 
suspected to result from misfolded 
proteins. In addition, one of the greatest 
challenges for the biotechnology industry 
is to coax bacteria into making vast quan- 
tities of properly folded human proteins. 

NMR is unsurpassed in its ability 
to teach scientists about how proteins fold. 

Most proteins start out like a loose string flopping 
around in a lake, possibly with short coiled sec- 
tions. The molecules contort quickly into various 
partially folded states before congealing into their 
final form. Because the process is so fast, scien- 
tists cannot study it directly. Instead, they reverse 
and interrupt the process. 

Scientists can force a protein to unfold by 
increasing the acidity of, raising the temperature of, 
or adding certain molecules to its liquid environ- 
ment. By capturing a protein in different stages of 
unraveling, researchers hope to understand how 
proteins fold normally. 



H. Jane Dyson and Peter Wright, a husband- 
and-wife team of NMR spectroscopists at the 
Scripps Research Institute in La Jolla, California, 
used this technique to study myoglobin in various 
folding states. 






Completely Folded 



Least Flexible 



Myoglobin, a small molecule that stores oxygen in muscle
tissue, is an ideal protein for studying the structure and 
dynamics of protein folding. It quickly folds into a compact, 
alpha-helical structure. Dyson and Wright used changes in 
acidity to reveal which regions are most flexible in different 
folding states. The first two "structures" show one of many 
possible conformations for a floppy, partially folded molecule. 

Adapted with permission from Nature Structural Biology 1998, 5:499-503 

Most proteins fold almost immediately after 
they are made. Some do not fold completely 
until they contact a target molecule. Others must 
partially unfold to cross a cell membrane, then 
refold on the other side. This last group includes 
the hundreds of proteins that leave their parent 
cell to circulate in the bloodstream — hormones, 
blood clotting factors, and immune system proteins. 

Studies of protein folding provide valuable insight 
into these basic life processes. 
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The Sweetest Puzzle 









“Getting a protein structure
using NMR is a lot of fun,” 
says Chele DeRider, a graduate 
student at the University of 
Wisconsin-Madison. “You’re given 
all these pieces to a puzzle and you 
have to use a set of rules, common 
sense, and intuitive thinking to put 
the pieces together. And when you 
do, you have a protein structure.” 

DeRider is working at UW-Madison’s
national NMR facility.

She is refining the structure of 
brazzein, a small, sweet protein. 

Most sweet-tasting molecules are 
sugars, not proteins; so brazzein 
is quite unusual. It also has other 
remarkable properties that make it 
attractive as a sugar substitute. It is 2,000 times 
sweeter than table sugar — with many fewer 
calories. And, unlike aspartame (NutraSweet®), 
it stays sweet even after 2 hours at nearly boiling 
temperatures. 




In addition to its potential impact in the 
multimillion-dollar market of sugar substitutes, 
brazzein may teach scientists how we perceive 
some substances as sweet. Researchers know 
which amino acids in brazzein are responsible 
for its taste — changing a single one can either 
enhance or eliminate this flavor — but they are 
still investigating how these amino acids react 
with tongue cells to trigger a sensation of sweetness. 
“Getting a protein structure using NMR is a lot of fun.
You start out with just dots on a page
and you end up with a protein structure.”



Chele DeRider 

Graduate Student 

University of Wisconsin-Madison 




DeRider became interested in NMR as an 
undergraduate student at Macalester College in 
St. Paul, Minnesota. She was studying organic 
chemistry, but found that she spent most of her 
time running NMR spectra on her compounds. 
“I realized that’s what I liked most about my 
research,” she says. 



After she finishes her graduate work, 
DeRider plans to obtain a postdoctoral fellow- 
ship to continue using NMR to study protein 
structure and then to teach at a small college 
similar to her alma mater. 




The plum-sized berries of this African plant
contain brazzein, a small, sweet protein.



Give one advantage and 
one disadvantage of NMR 
when compared to X-ray 
crystallography. 



What do NMR spectros- 
copists learn from a 
NOESY experiment? 



Why is it important to 
study protein folding? 
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CHAPTER 4 



Structure-Based Drug Design: From the Computer to the Clinic
In 1981, doctors recognized a strange new
disease in the United States. The first handful
of patients suffered from unusual cancers and
pneumonias. As the disease spread, scientists
discovered its cause — a virus that attacks human
immune cells. Now a major killer worldwide,
the disease is best known by its acronym, AIDS.



Formally called acquired immunodeficiency 
syndrome, AIDS is caused by the human 
immunodeficiency virus, or HIV. 

Although researchers have not found a cure 
for AIDS, structural biology has greatly enhanced 
their understanding of HIV and has played a key 
role in the development of drugs to treat this 
deadly disease. 




Coat proteins on the 
viral surface bind to 
receptor molecules on 
a human immune cell 

This tricks the cell into 
engulfing the virus 
particles 

Some researchers 
hope to prevent this 
binding so HIV never
enters the human cell 




HIV was quickly recognized as a retrovirus, 
a type of virus that carries its genetic material 
not as DNA, as do most other organisms on
the planet, but as RNA that the virus then
"reverse transcribes" into DNA. 

Long before anyone had heard of HIV, 
researchers in labs all over the world studied 
retroviruses, some of which were known to 
cause cancers in animals. These scientists 
traced out the life cycle of retroviruses and 
identified the key proteins and enzymes the 
viruses use to infect cells. 

When HIV was identified as a retrovirus, 
the work of these scientists gave AIDS 
researchers an immediate jump-start. 

The viral proteins they had already 
identified became initial drug targets. 
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Targets of 
Current Drugs: 

Reverse Transcriptase 

Protease 



Receptor 

Molecule 




HIV Particle 

(enlarged to show detail) 



Human Immune Cell 



The virus incorporates its genetic material
into the human cell's DNA 

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The viral protein 
strands and RNA 
are assembled 
into immature 
"daughter" virus 
particles that 
bud off from 
the cell 




Mature virus particles are 
able to attack other 
human immune cells 





HIV protease chops the viral 
protein strands into separate 
proteins, causing the "daugh- 
ter" virus particles to mature 
into infectious particles 

HIV protease inhibitors 
block this step 



The cell's normal machinery churns out 
viral RNA and long viral protein strands 



Human Cell Nucleus 



t 
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Revealing the Target 

Our story begins in 1989, when scientists determined 
the X-ray crystallographic structure of HIV 
protease, a viral enzyme critical in HIV's life cycle. 
Pharmaceutical scientists hoped that by blocking 
this enzyme, they could prevent the virus from 
spreading in the body. 




With the structure of HIV protease at their 
fingertips, researchers were no longer working 
blindly. They could finally see their target 
enzyme — in exhilarating, color-coded detail. 

By feeding the structural information into a 
computer modeling program, they could spin 
a model of the enzyme around, zoom in on 
specific atoms, analyze its chemical properties, 
and even strip away or alter parts of it. 

Most importantly, they could use the computerized
structure as a reference to determine the types
of molecules that might block the enzyme. These
molecules can be retrieved from chemical libraries
or can be designed on a computer screen and then
synthesized in a laboratory. Such structure-based
drug design strategies have the potential to shave
off years and millions of dollars from the traditional
trial-and-error drug development process.



HIV protease is a symmetrical molecule with two equal halves and an active
site near its center.

Molecular models of HIV protease in this chapter were generated by Alisa Zapp Machalek 
These strategies worked in the case of HIV
protease inhibitors. “I think it’s a remarkable 
success story,” says Dale Kempf, a chemist involved 
in the HIV protease inhibitor program at Abbott 
Laboratories. “From the identification of HIV 
protease as a drug target in 1988 to early 1996, 
it took less than 8 years to have three drugs on 
the market.” Typically, it takes at least $500 million 
and 15 years to develop a drug from scratch. 

The structure of HIV protease revealed 
a crucial fact — like a butterfly, the 
enzyme is made up of two equal 
halves. For most such symmetrical 
molecules, both halves have a “business 
area,” or active site, that carries out the 
enzyme’s job. But HIV protease has only 
one such active site — in the center of the 
molecule where the two halves meet. 

Pharmaceutical scientists knew they could take 
advantage of this feature. If they could plug this 
single active site with a small molecule, they could 
shut down the whole enzyme — and theoretically 
stop the virus’ spread in the body. 

Natural Substrate Molecules 




Initial Lead Compound 



Knowing that HIV protease has two symmetrical
halves, pharmaceutical researchers initially attempted 
to block the enzyme with symmetrical small molecules. 
They made these by chopping in half molecules of 
the natural substrate, then making a new molecule 
by fusing together two identical halves of the natural 
substrate. 



Several pharmaceutical companies started out by 
using the enzyme’s shape as a guide. “We designed 
drug candidate molecules that had the same two-fold
symmetry as HIV protease,” says Kempf.
“Conceptually, we took some of the enzyme’s natural
substrate [the molecules it acts upon], chopped
these molecules in half, rotated them 180 degrees,
and glued two identical halves together.”

To the researchers’ delight, the first such 
molecule they synthesized fit perfectly into the 
active site of the enzyme. It was also an excellent 
inhibitor: it prevented HIV protease from functioning
normally. But it wasn’t water-soluble,
meaning it couldn’t be absorbed by the body 
and would never be effective as a drug. 

Abbott scientists continued to tweak the structure
of the molecule to improve its properties. They
eventually ended up with a nonsymmetrical molecule
they called Norvir® (ritonavir).




Solubility

Affects how well the drug 
candidate can be absorbed 
by the body if taken orally 






Oral Bioavailability 

How much drug candidate 
reaches the appropriate 
tissue(s) in its active form 
when given orally 



A drug candidate molecule must pass many hurdles to earn the description
"good medicine." It must have the best possible activity, solubility, bioavailability,
half-life, and metabolic profile. Attempting to improve one of these factors
often affects other factors. For example, if you structurally alter a lead compound
to improve its activity, you may also decrease its solubility or shorten
its half-life. The final result must always be the best possible compromise.








Structure-Based Drug Design: Blocking the Lock 



Traditionally, scientists identify new drugs either by 
fiddling with existing drugs or by testing thousands 
of compounds in a laboratory. If you think of the 
target molecule — HIV protease in this case — as a 
lock, this approach is rather like trying to design a 
key perfectly shaped to the lock if you're given an 
armload of tiny metal scraps, glue, and wire cutters. 

Using a structure-based strategy, researchers 
have an initial advantage. With molecular modeling 
software, they can make a “mold” of the lock and of 
the natural molecule, called a substrate, that fits 
into the lock and opens the door to viral replication.
The goal is to plug the lock by finding a small
molecule that fits inside HIV protease and prevents 
the natural substrate from entering. 

Knowing the exact three-dimensional shape 
of the lock, scientists can discard any of the metal 
scraps (small molecules) that are not the right size 
or shape to fit the lock. They might even be able 
to design a small molecule to fit the lock precisely. 
Such a molecule may be a starting point — a lead 
compound — for pharmaceutical researchers who 
are designing a drug to treat HIV infection. 
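
The idea of discarding "metal scraps" that cannot fit the lock can be sketched as a simple filter. The example below is a toy, not a real docking program: the candidate names, the volume and charge numbers, the tolerance, and the rule itself (a candidate must nearly fill the pocket without exceeding it, and must carry the opposite charge) are all assumptions chosen for illustration. Real modeling software compares full three-dimensional shapes and many chemical properties.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical small molecule from a chemical library."""
    name: str
    volume: float  # arbitrary units, invented for this example
    charge: int    # net charge, invented for this example


def fits_pocket(candidate, pocket_volume, pocket_charge, tolerance=0.15):
    """Crude stand-in for a shape-and-chemistry check.

    A candidate passes if it is no larger than the pocket, fills at
    least (1 - tolerance) of it, and has the opposite charge.
    """
    smallest_ok = pocket_volume * (1.0 - tolerance)
    size_ok = smallest_ok <= candidate.volume <= pocket_volume
    charge_ok = candidate.charge == -pocket_charge
    return size_ok and charge_ok


# A tiny made-up library of "metal scraps".
library = [
    Candidate("scrap_A", 300.0, 0),   # too small
    Candidate("scrap_B", 520.0, 1),   # too large
    Candidate("scrap_C", 480.0, 1),   # right size, complementary charge
    Candidate("scrap_D", 495.0, -1),  # right size, wrong charge
]

leads = [c.name for c in library
         if fits_pocket(c, pocket_volume=500.0, pocket_charge=-1)]
print(leads)
```

Anything that survives a filter like this would only be a starting point, a possible lead compound, and would still need laboratory testing.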

Of course, biological molecules are much more 
complex than locks and keys, and human bodies 
can react in unpredictable ways to drug molecules, 
so the road from the computer screen to pharmacy 
shelves remains long and bumpy. 



►Traditional drug design often 
requires random testing of 
thousands — if not hundreds 
of thousands — of compounds 
(shown here as metal scraps) 




►By knowing the shape and 
chemical properties of the 
target molecule, scientists 
using structure-based 
drug design strategies 
can approach the job 
more "rationally." 

They can discard 
the drug candidate 
molecules that have 
the wrong shape 
or properties. 
Clinical Trials: Testing on humans is still
one of the most time-consuming parts
of drug development and one that is not
accelerated by structural approaches.






A Hope for the Future 

Between December 1995 and March 1996, 
the Food and Drug Administration approved 
the first three HIV protease inhibitors — 
Hoffmann-La Roche’s Invirase™ (saquinavir),
Abbott’s Norvir™ (ritonavir), and Merck and 
Co., Inc.’s Crixivan® (indinavir). Initially, these 
drugs were hailed as the first real hope in 15 years 
for people with AIDS. Newspaper headlines 
predicted that AIDS might even be cured. 

Although HIV protease inhibitors did not 
become the miracle cure many had hoped for, 
they represent a triumph for antiviral therapy. 
Antibiotics that treat bacterial diseases abound 
(although they are becoming less effective as 
bacteria develop resistance), but doctors have 
very few drugs to treat viral infections. 



Protease inhibitors are also noteworthy because 
they are a classic example of how structural biology 
can enhance traditional drug development. “They 
show that with some ideas about structure and 
rational drug design, combined with traditional 
medicinal chemistry, you can come up with potent 
drugs that function the way they’re predicted to,” 
says Kempf. 

“That doesn’t mean we have all the problems 
solved yet,” he continues. “But clearly these 
compounds have made a profound impact on 
society.” The death rate from AIDS went down 
dramatically after these drugs became available. 
Now protease inhibitors are often prescribed with 
other anti-HIV drugs to create a “combination 
cocktail” that is more effective at squelching 
the virus than are any of the drugs individually. 





HIV produces many 
different versions of 
itself in a patient's body 
(although the huge
majority are the normal 
form) 



Drugs kill all of these 
virus particles except 
those that are resistant 
to the drugs 



The resistant virus 
particles continue to 
reproduce. Soon the 
drug is no longer 
effective for the patient. 
Homing in on Resistance

HIV is a moving target. When it reproduces inside 
the body, instead of generating exact replicas of 
itself, it churns out a variety of slightly altered 
daughter virus particles. Some of these mutants 
are able to evade, or “resist,” the effects of a drug — 
and can pass that resistance on to their own 
daughter particles. While most virus particles 
initially succumb to the drug, these resistant mutants 
survive and multiply. Eventually, the drug loses its 
anti-HIV activity, because most of the virus particles 
in the infected person are resistant to it. 
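
The takeover by resistant virus can be followed with a little arithmetic. The sketch below is a deliberately simplified model, not a description of real HIV dynamics: the starting numbers, the doubling per generation, and the 90 percent kill rate are invented, and the model ignores new mutations and the immune system. It only shows how a rare resistant variant can come to dominate once a drug removes most of the normal virus each generation.

```python
def resistant_fraction_over_time(normal, resistant, growth=2.0,
                                 drug_kill=0.9, generations=10):
    """Track the share of resistant particles under drug treatment.

    Each generation both kinds of virus multiply by `growth`. The drug
    then removes a fraction `drug_kill` of the normal particles and
    none of the resistant ones. All values are hypothetical.
    """
    fractions = []
    for _ in range(generations):
        normal = normal * growth * (1.0 - drug_kill)
        resistant = resistant * growth
        fractions.append(resistant / (normal + resistant))
    return fractions


# Start with a million normal particles and only ten resistant ones.
history = resistant_fraction_over_time(normal=1_000_000, resistant=10)
for generation, fraction in enumerate(history, start=1):
    print(f"generation {generation}: {fraction:.4f} resistant")
```

With these made-up settings the resistant share starts out tiny, reaches about half after five generations, and is nearly the whole population by the tenth.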

Some researchers now are working on 
new generations of HIV protease inhibitors that 
are designed to combat specific drug-resistant 
viral strains. 




Scientists have identified dozens of mutations
(shown in red) that allow HIV protease to escape 
the effects of drugs. The protease molecules in 
some drug-resistant HIV strains have two or three 
such mutations. To outwit the enzyme's mastery 
of mutation, researchers are designing drugs that 
interact specifically with amino acids in the enzyme 
that are critical for the enzyme's function. This 
approach cuts off the enzyme's escape routes. 

As a result, the enzyme — and thus the entire virus — 
is forced to succumb to the drug. 



Detailed, computer-modeled pictures of HIV 
protease from these strains reveal how even amino 
acid substitutions far away from the enzyme’s active 
site can produce drug resistance. Some research 
groups are trying to beat the enzyme at its own game 
by designing drugs that bind specifically to these 
mutant amino acids. Others are designing molecules
that latch onto the enzyme’s Achilles’ heels:
the aspartic acids in the active site and other
amino acids that, if altered, would render 
the enzyme useless. Still others are trying to 
discover inhibitors that are more potent, more 
convenient to take, have fewer side effects, or are 
better able to combat mutant strains of the virus. 




STUDENT SNAPSHOT



The Fascination of Infection 



“I really like to study retroviruses,”
says Kristi Pullen, who majored
in biochemistry at the University
of Maryland, Baltimore County
(UMBC). “I also like highly infectious
agents, like Ebola. The more virulent
something is, the less it’s worked on,
so it opens up all sorts of fascinating 
questions. I couldn’t help but be 
interested.” 

In addition to her UMBC classwork,
Pullen helped determine the
structure of retroviruses in the NMR 
spectroscopy laboratory of Michael 
Summers. This research focuses on 
how retroviruses package “RNA 
warheads” that enable them to 
spread in the body. Eventually, the 
work may reveal a new drug target 
for retroviral diseases, including AIDS. 



“Working in Dr. Summers’ lab and other labs teaches you that
research can be fun. It’s not just a whole lot of people
in white coats. We went biking and skiing together.
All the people were great to work with.”

Kristi Pullen 

Graduate Student 

University of California, Berkeley 



Until her senior year in high school, Pullen 
wanted to be an orthopedic surgeon. But after 
her first experience working in a lab, she recognized 
“there’s more to science than medicine.” Then, 
after taking some science courses, she realized 
she had an inner yearning to learn science and 
to work in a lab. 

Pullen is now a graduate student at the 
University of California, Berkeley in the Department 
of Molecular and Cell Biology. She plans to continue
studying structural biology, to earn a Ph.D., and
possibly also to earn an M.D.

She also has some longer-term goals. 
“Ultimately what I want to do way, way, way 
down the line is head the NIH [National Institutes 
of Health] or CDC [Centers for Disease Control 
and Prevention] and in that way affect the health 
of a large number of people — the whole country.” 
While the HIV protease inhibitors are classic
examples of structure-based drug design, they
are also somewhat unusual, at least for now.
Although many pharmaceutical companies have
entire divisions devoted to structural biology,
most use it as a complementary approach, in
partnership with other, more traditional, means 
of drug discovery. In many cases, the structure 
of a target molecule is determined after traditional 
screening, or even after a drug is on the market. 

This was the case for Celebrex®, a drug marketed 
by the Searle pharmaceutical company. Celebrex® 
was initially designed to treat osteoarthritis and 
adult rheumatoid arthritis, but it is now the first 
drug approved to treat a rare condition called FAP, 
for familial adenomatous polyposis, that leads to 
colon cancer. 

Normally, the pain and swelling of arthritis 
are treated with drugs like aspirin or Advil® 
(ibuprofen), the so-called NSAIDs, or non-steroidal 
anti-inflammatory drugs. But these medications 
can cause damage to gastrointestinal organs, 
including bleeding ulcers. In fact, a recent study 
found that such side effects result in more than 
100,000 hospitalizations and 16,500 deaths every 
year. According to another study, if these side 
effects were included in tables listing mortality 
data, they would rank as the 15th most common 
cause of death in the United States. 



Rheumatoid arthritis is an immune system
disorder that affects more than 2 million
Americans, causing pain, stiffness, and
swelling in the joints. It can cripple hands,
wrists, feet, knees, ankles, shoulders, and
elbows. It also causes inflammation in
internal organs and can lead to permanent
disability. Osteoarthritis has some of the
same symptoms, but it develops more
slowly and only affects certain joints.
A fortunate discovery enabled scientists to
design drugs that retain the anti-inflammatory 
properties of NSAIDs without the ulcer-causing 
side effects. 

By studying the drugs at the molecular level, 
researchers learned that NSAIDs block the 
action of two closely related enzymes called 
cyclooxygenases. These enzymes are abbreviated 
COX-1 and COX-2. 

Although the enzymes share some of the same
functions, they also differ in important ways.
COX-2 is produced in response to injury or infection
and activates molecules that trigger inflammation 
and an immune response. By blocking COX-2, 
NSAIDs reduce inflammation and pain caused 
by arthritis, headaches, and sprains. 

In contrast, COX-1 produces molecules, called 
prostaglandins, that protect the lining of the stomach
from digestive acids. When NSAIDs block this
function, they foster ulcers. 



Some prostaglandins 
may participate in 
memory and other 
brain functions 



Two prostaglandins
increase blood
flow in the kidney

Two prostaglandins
contract uterine muscles;
another relaxes them




Some prostaglandins 
sensitize nerve endings 
that transmit pain signals 
to the spinal cord and brain 



Two prostaglandins relax 
muscles in the lungs; 
another contracts them 

Two prostaglandins
protect the lining of
the stomach



Some prostaglandins dilate 
small blood vessels, which 
leads to the redness and 
feeling of heat associated 
with inflammation 



Both COX-1 and COX-2 produce prostaglandins, 
which have a variety of different — and sometimes 
opposite — roles in the body. Some of these roles 
are shown here. 
To create an effective painkiller that doesn’t
cause ulcers, scientists realized they needed to
develop new medicines that shut down COX-2 but
not COX-1. Such a compound was discovered
using standard medicinal chemistry. Searle marketed
it under the name Celebrex®, and it quickly
became the fastest selling drug in U.S. history,
generating more prescriptions in its first year
than the next two leading drugs combined.

At the same time, scientists were working out 
the molecular structure of the COX enzymes. 
Through structural biology, they could see exactly 
why Celebrex® — and other so-called “super 
aspirin” drugs — plug up COX-2 but not COX-1. 

The three-dimensional structures of COX-2
and COX-1 are almost identical. But there is one
amino acid change in the active site of COX-2 that
creates an extra binding pocket. It is this extra
pocket into which Celebrex® binds.

The overall structures of COX-1 and COX-2 (ribbons)
are nearly identical, but a close-up of the active site
reveals why Celebrex® and similar molecules can
bind to COX-2 but not to COX-1. A single amino acid
substitution makes all the difference. At this one
position, COX-2 contains valine, a small amino acid,
while COX-1 contains isoleucine. The valine in COX-2
creates a pocket into which the "super aspirin"
drugs (in yellow) can bind. The isoleucine in
COX-1 elbows out the drugs. Because Celebrex®
and other "super aspirin" drugs bind only to
COX-2 and not to COX-1, they control pain and
inflammation without causing stomach ulcers.

Adapted with permission from Nature ©1996 Macmillan Magazines Ltd.

In addition to showing researchers in atom-by-atom
detail how the drug binds to its target, the
structures are also greatly aiding the design
of new, second- and third-generation drugs that
have different properties than Celebrex® or
work better for certain people. And of course the
structure of the COX enzymes will continue to
provide basic researchers with insight into
how these molecules work in the body.
Valine



Isoleucine 



What is structure-based 
drug design? 



How was structure-based 
drug design used to develop 
an HIV protease inhibitor? 



How is the structural 
difference between COX-1 
and COX-2 responsible for 
the effectiveness of 
Celebrex®? 



How do viruses become 
resistant to drugs? 
CHAPTER 5



Beyond Drug Design 



This booklet has focused on drug design as
the most immediate medical application of
structural biology. But structural biology has value
and potential far beyond the confines of the pharmaceutical
industry. At its root, structural biology
teaches us about the fundamental nature of biological
molecules. The examples below provide a tiny
glimpse into areas in which structural biology has
shed, and continues to shed, light.



Muscle Contraction 

With every move you make, from a sigh to a sprint, 
thick ropes of myosin muscle proteins slide across 
rods of actin proteins in your cells. These proteins 
also pinch cells in two during cell division and 
enable cells to move and change shape — a process 
critical both to the formation of different tissues 
during embryonic development and to the spread 
of cancer. Detailed structures are available for both 
myosin and actin. 
The structure of RNA polymerase suggests, at the molecular level,
how it reads DNA and makes a complementary strand of RNA. 

Courtesy of Roger Kornberg, Stanford University 



Transcription and Translation
Cells use DNA instructions to make proteins.
Dozens of molecules (mostly proteins) cling
together and separate at carefully choreographed
times to accomplish this task. The structures of
many of these molecules are known and have
provided a better understanding of these basic
cellular processes. One example is RNA polymerase,
an enzyme that reads DNA and synthesizes a
complementary strand of RNA. The enzyme is a
molecular machine composed of a dozen different
small proteins. The X-ray structure of RNA
polymerase suggests a role for each of its proteins.
The structure also reveals a pair of jaws that appear
to grip DNA, a clamp that holds it in place, a pore
through which RNA nucleotides probably enter,
and grooves through which the completed RNA
strand may thread out of the enzyme.
Photosynthesis

“Photosynthesis is the most important chemical 
reaction in the biosphere, as it is the prerequisite 
for all higher life on Earth,” according to the Nobel 
Foundation, which awarded its 1988 Nobel Prize in 
chemistry to three researchers who determined the 
structure of a protein central to photosynthesis. 




This bacterial photosynthetic reaction center was the first membrane protein
to have its structure determined. The purple spirals (alpha helices) show where 
the protein crosses the membrane. In the orientation above, the left part of the 
molecule protrudes from the outside of the bacterial cell, while the right side is 
inside the cell. 



This protein, from a photosynthetic bacterium 
rather than from a plant, was the first X-ray 
crystallographic structure of a protein embedded 
in a membrane. The achievement was remarkable, 
because it is very difficult to dissolve membrane-bound
proteins in water, an essential step in
the crystallization process. To borrow further 
from the Nobel Foundation: “[This] structural 
determination . . . has considerable chemical 
importance far beyond the field of photosynthesis. 
Many central biological functions in addition 
to photosynthesis . . . are associated with membrane-bound
proteins. Examples are transport
of chemical substances between cells, hormone 
action, and nerve impulses” — in other words, 
signal transduction. 



Signal Transduction
Hundreds, if not thousands, of life processes 
require a biochemical signal to be transmitted 
into cells. These signals may be hormones, small 
molecules, or electrical impulses, and they may 
reach cells from the bloodstream or other cells. 
Once signal molecules bind to receptor proteins 
on the outside surface of a cell, they initiate a cascade 
of reactions involving several other molecules 
inside the cell. Depending on the nature of the 
target cell and of the signaling molecule, this 
chain of reactions may trigger a nerve impulse, 
a change in cell metabolism, or the release of
a hormone. Researchers have determined the 
structure of some molecules involved in common 
signal transduction pathways. 

The receptor proteins that bind to the original
signal molecule are often embedded in the cell’s
outer membrane, so, like proteins involved in
photosynthesis, they are difficult to crystallize.
Obtaining structures from receptor proteins not
only teaches us more about the basics of signal
transduction, it also brings us back to the
pharmaceutical industry. At least 50 percent
of the drugs on the market target receptor
proteins, more than target any other type
of molecule.

As this booklet shows, a powerful way to 
learn more about health, to fight disease, and 
to deepen our understanding of life processes 
is to study the details of biological molecules:
the remarkable structures of life.







Considering this 
booklet as a whole, 
how would you define 
structural biology? 



What are the 
scientific goals of 
those in the field? 



If you were a structural 
biologist, what proteins 
or systems would you 
study? Why? 



Members of a family of molecules, called G proteins,
often act as conduits to pass the molecular message 
from receptor proteins to molecules in the cell's interior. 
Acquired immunodeficiency syndrome
(AIDS) | A viral disease caused by the human 
immunodeficiency virus (HIV). 

Active site | The region of an enzyme to which 
a substrate binds and at which a chemical 
reaction occurs. 

AIDS | Acquired immunodeficiency syndrome — 
an infectious disease that is a major killer worldwide. 

Alpha helix | A short, spiral-shaped section 
within a protein structure. 

Amino acid | A chemical building block of 
proteins. There are 20 standard amino acids. A 
protein consists of a specific sequence of amino acids. 

Angstrom | A unit of length used for measuring 
atomic dimensions. One angstrom equals 10^-10 meters.

Antibiotic-resistant bacteria | A strain of 
bacteria with slight alterations (mutations) in 
some of their molecules that enable the bacteria 
to survive drugs designed to kill them. 

Atom | A fundamental unit of matter. It consists 
of a nucleus and electrons. 

AZT (azido-deoxythymidine) | A drug used 
to treat HIV. It targets the reverse transcriptase enzyme. 

Bacterium (pl. bacteria) | A primitive, one-celled
microorganism without a nucleus. Bacteria live 
almost everywhere in the environment. Some 
bacteria may infect humans, plants, or animals. 
They may be harmless or they may cause disease. 




Base | A chemical component (the fundamental 
information unit) of DNA or RNA. There are four 
bases in DNA: adenine (A), thymine (T), cytosine 
(C), and guanine (G). RNA also contains four bases, 
but instead of thymine, RNA contains uracil (U). 

Beta sheet | A pleated section within a protein 
structure. 

Chaperones | Proteins that help other proteins 
fold or escort other proteins throughout the cell. 

Chemical shift | An atomic property that varies 
depending on the chemical and magnetic properties 
of an atom and its arrangement within a molecule. 
Chemical shifts are measured by NMR spectroscopists 
to identify the types of atoms in their samples. 

COX-1 (cyclooxygenase-1) | An enzyme
made continually in the stomach, blood vessels, 
platelet cells, and parts of the kidney. It produces 
prostaglandins that, among other things, protect 
the lining of the stomach from digestive acids. 
Because NSAIDs block COX-1, they foster ulcers. 

COX-2 (cyclooxygenase-2) | An enzyme 
found in only a few places, such as the brain and 
parts of the kidney. It is made only in response 
to injury or infection. It produces prostaglandins 
involved in inflammation and the immune response. 
NSAIDs act by blocking COX-2. Because elevated 
levels of COX-2 in the body have been linked to 
cancer, scientists are investigating whether blocking 
COX-2 may prevent or treat some cancers. 




Cyclooxygenases | Enzymes that are responsible 
for producing prostaglandins and other molecules 
in the body. 

Deoxyribose | The type of sugar in DNA. 

DNA (deoxyribonucleic acid) | The substance
of heredity. A long, usually double-stranded chain 
of nucleotides that carries genetic information 
necessary for all cellular functions, including 
the building of proteins. DNA is composed of 
the sugar deoxyribose, phosphate groups, and 
the bases adenine, thymine, guanine, and cytosine. 

Drug target | See target molecule. 

Electromagnetic radiation | Energy radiated 
in the form of a wave. It includes all kinds of 
radiation, including, in order of increasing energy, 
radio waves, microwaves, infrared radiation (heat), 
visible light, ultraviolet radiation, X-rays, and 
gamma radiation. 

Enzyme | A substance, usually a protein, that 
speeds up, or catalyzes, a specific chemical reaction 
without being permanently altered or consumed. 
Some RNA molecules can also act as enzymes. 

Gauss | A unit of magnetic field strength 
(also called magnetic flux density). The Earth’s 
magnetic field at its surface is approximately 
0.5 gauss. A good loudspeaker coil is on the 
order of 10,000 gauss, or 1 tesla. 

Gene | A unit of heredity. A segment of DNA 
that contains the code for a specific protein or 
protein subunit. 
Genetic code | The set of triplet letters in DNA
(or mRNA) that code for specific amino acids. 

HIV protease | An HIV enzyme that is required 
during the life cycle of the virus. It is required 
for HIV virus particles to mature into fully 
infectious particles. 

Human immunodeficiency virus (HIV) | The virus that causes AIDS.

Inhibitor | A molecule that “inhibits,” or blocks, 
the biological action of another molecule. 

Isotope | A form of a chemical element that 
contains the same number of protons but a 
different number of neutrons than other forms 
of the element. Isotopes are often used to trace 
atoms or molecules in a metabolic pathway. In 
NMR, only one isotope of each element contains 
the correct magnetic properties to be useful. 

Kilodalton | A unit of mass equal to 1,000 daltons. 
A dalton is a unit used to measure the mass of 
atoms and molecules. One dalton equals the atomic 
weight of a hydrogen atom (1.66 x 10^-24 grams).

Lead compound | A molecule, usually a small 
one, that pharmaceutical researchers use as the 
basis for a drug. Often, the lead compound shows 
some of the desired biological activity, but it must 
be chemically altered to enhance this activity and 
to make the molecule safe and effective for delivery 
as a drug. 

MAD | See multi-wavelength anomalous diffraction.
Megahertz | A unit of measurement equal to
1,000,000 hertz. A hertz is defined as one event 
or cycle per second and is used to measure the 
frequency of radio waves and other forms of 
electromagnetic radiation. The strength of NMR 
magnets is often reported in megahertz, with most 
NMR magnets ranging from 500 to 800 megahertz. 

Messenger RNA (mRNA) | An RNA molecule 
that serves as an intermediate in the synthesis of 
protein. Messenger RNA is complementary to DNA 
and carries genetic information to the ribosome. 

Molecule | The smallest unit of matter that 
retains all of the physical and chemical properties 
of that substance. It consists of one or more 
identical atoms or a group of different atoms 
bonded together. 

mRNA | Messenger RNA. 

Multi-dimensional NMR | A technique used 
to solve complex NMR problems. 

Multi-wavelength anomalous diffraction 
(MAD) | A technique used in X-ray crystallography 
that accelerates the determination of protein 
structures. It uses X-rays of different wavelengths, 
relieving crystallographers from having to make 
several different metal-containing crystals. 

NMR | Nuclear magnetic resonance. 

NMR-active atom | An atom that has the 
correct magnetic properties to be useful for NMR. 
For some atoms, the NMR-active form is a rare 
isotope, such as ¹³C or ¹⁵N.



NOESY | Nuclear Overhauser effect spectroscopy. 

Non-steroidal anti-inflammatory drugs | A class
of medicines used to treat pain and
inflammation. Examples include aspirin and 
ibuprofen. They work by blocking the action 
of the COX-2 enzyme. Because they also block 
the COX-1 enzyme, they can cause side effects 
such as stomach ulcers. 

NSAIDs | Non-steroidal anti-inflammatory 
drugs such as aspirin or ibuprofen. 

Nuclear magnetic resonance (NMR) 
spectroscopy | A technique used to determine 
the detailed, three-dimensional structure of 
molecules and, more broadly, to study the physical, 
chemical, and biological properties of matter. 
It uses a strong magnet that interacts with the 
natural magnetic properties of atomic nuclei. 

Nuclear Overhauser effect spectroscopy 
(NOESY) | An NMR technique used to help 
determine protein structures. It reveals how close 
different protons (hydrogen nuclei) are to each 
other in space. 

Nucleotide | A subunit of DNA or RNA that 
includes one base, one phosphate molecule, and 
one sugar molecule (deoxyribose in DNA, ribose 
in RNA). Thousands of nucleotides join end-to-end 
to create a molecule of DNA or RNA. See base, 
phosphate group. 
Nucleus (pl. nuclei) | 1. The membrane-bounded 
center of a cell, which contains genetic 
material. 2. The center of an atom, made up of 
protons and neutrons. 

Phosphate group | A chemical group found 
in DNA and RNA, and often attached to proteins 
and other biological molecules. It is composed of 
one phosphorus atom bound to four oxygen atoms. 

Photosynthesis | The chemical process by 
which green plants, algae, and some bacteria use 
the Sun's energy to synthesize organic compounds 
(initially carbohydrates). 

Prostaglandins | A hormone-like group of 
molecules involved in a variety of functions in the 
body, including inflammation, blood flow in the 
kidney, protection of the stomach lining, blood 
clotting, and relaxation or contraction of muscles 
in the lungs, uterus, and blood vessels. The formation 
of prostaglandins is blocked by NSAIDs. 

Protein | A large biological molecule composed 
of amino acids arranged in a specific order 
determined by the genetic code and folded into 
a specific three-dimensional shape. Proteins are 
essential for all life processes. 

Receptor protein | A specific protein found 
on the cell surface to which hormones or other 
molecules bind, triggering a specific reaction 
within the cell. Receptor proteins are responsible 
for initiating reactions as diverse as nerve impulses, 
changes in cell metabolism, and hormone release. 





Resistance | See antibiotic-resistant bacteria. 
Viruses can also develop resistance to antiviral drugs. 

Retrovirus | A type of virus that carries its 
genetic material as single-stranded RNA, rather 
than as DNA. Upon infecting a cell, the virus 
generates a DNA replica of its RNA using 
the enzyme reverse transcriptase. 

Reverse transcriptase | An enzyme found in 
retroviruses that copies the virus’ genetic material 
from single-stranded RNA into double-stranded DNA. 
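
The copying step can be pictured as base pairing applied twice: once to build a DNA strand from the RNA, and once more to build the partner DNA strand. The following is a simplified sketch that ignores strand direction and the enzyme's chemistry. The function names and the example sequence are hypothetical.

```python
# Base-pairing rules: RNA base -> complementary DNA base,
# and DNA base -> complementary DNA base.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_dna_strand(rna):
    """Build the DNA strand that pairs with the viral RNA."""
    return "".join(RNA_TO_DNA[base] for base in rna)


def second_dna_strand(dna):
    """Build the partner strand, giving double-stranded DNA."""
    return "".join(DNA_TO_DNA[base] for base in dna)


viral_rna = "AUGCCU"  # hypothetical fragment, for illustration only
strand_one = first_dna_strand(viral_rna)
strand_two = second_dna_strand(strand_one)
print(strand_one)  # TACGGA
print(strand_two)  # ATGCCT
```

The second strand carries the same sequence as the original RNA, with thymine (T) in place of uracil (U).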

Ribose | The type of sugar found in RNA. 

Ribosomal RNA | RNA found in the ribosome. 

RNA (ribonucleic acid) | A long, usually 
single-stranded chain of nucleotides that has 
structural, genetic, and enzymatic roles. There are 
three major types of RNA, which are all involved 
in making proteins: messenger RNA (mRNA), 
transfer RNA (tRNA), and ribosomal RNA 
(rRNA). RNA is composed of the sugar ribose, 
phosphate groups, and the bases adenine, uracil, 
guanine, and cytosine. Certain viruses contain 
RNA, instead of DNA, as their genetic material. 

Side chain | The part of an amino acid that 
confers its identity. Side chains range from a single 
hydrogen atom (for glycine) to a group of 15 or 
more atoms. 

Signal transduction | The process by which 
chemical, electrical, or biological signals are 
transmitted into and within a cell. 
Structural biology | A field of study dedicated 
to determining the detailed, three-dimensional 
structures of biological molecules to better 
understand the function of these molecules. 

Structural genomics | A field of study that seeks 
to determine a large inventory of protein structures 
based on gene sequences. The eventual goal is to 
be able to produce approximate structural models of 
any protein based on its gene sequence. From these 
structures and models, scientists hope to learn 
more about the biological function of proteins. 

Structure-based drug design | An approach 
to developing medicines that takes advantage of the 
detailed, three-dimensional structure of target 
molecules. 

Substrate | A molecule that binds to an enzyme 
and undergoes a chemical change during the 
ensuing enzymatic reaction. 

Synchrotron | A large machine that accelerates 
electrically charged particles to nearly the speed 
of light and maintains them in circular orbits. 
Originally designed for use by high-energy physicists, 
synchrotrons are now heavily used by structural 
biologists as a source of very intense X-rays. 

Target molecule (or target protein) | The 
molecule on which pharmaceutical researchers 
focus when designing a drug. Often, the target 
molecule is from a virus or bacterium, or is 
an abnormal human protein. In these cases, 
the researchers usually seek to design a small 
molecule (a drug) to bind to the target molecule 
and block its action. 

Tesla | A unit of magnetic field strength (also called 
magnetic flux density). A field of 1 tesla is quite 
strong; the largest NMR magnets are approximately 
20 teslas. One tesla equals 10,000 gauss. 
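
The conversion stated above is simple arithmetic. A short sketch follows; the function name is chosen here for illustration only.

```python
GAUSS_PER_TESLA = 10_000  # conversion factor stated in the entry above


def tesla_to_gauss(tesla):
    """Convert a magnetic field strength from teslas to gauss."""
    return tesla * GAUSS_PER_TESLA


# A large NMR magnet of about 20 teslas, expressed in gauss.
print(tesla_to_gauss(20))  # 200000
```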

Transcription | The first major step in protein 
synthesis, in which the information coded in DNA 
is copied (transcribed) into mRNA. 

Translation | The second major step in protein 
synthesis, in which the information encoded in 
mRNA is deciphered (translated) into sequences of 
amino acids. This process occurs at the ribosome. 
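
The two steps of protein synthesis, transcription and translation, can be sketched in a few lines of code. This is an illustration under stated assumptions, not a model of the cell's machinery: the DNA template strand is assumed to be written so that its base-by-base complement is the mRNA, the codon table is only a small part of the standard genetic code, and the function names and example sequence are hypothetical.

```python
# Transcription pairs each DNA base with its complementary RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Partial codon table (the full genetic code has 64 codons).
CODON_TABLE = {
    "AUG": "Met", "CCU": "Pro", "UUU": "Phe", "GGC": "Gly",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(template_dna):
    """Copy a DNA template strand into complementary mRNA."""
    return "".join(DNA_TO_RNA[base] for base in template_dna)


def translate(mrna):
    """Read mRNA three bases at a time and return the amino acids,
    stopping at the first stop codon."""
    amino_acids = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        amino_acids.append(amino_acid)
    return amino_acids


mrna = transcribe("TACGGAAAAATT")  # hypothetical template strand
print(mrna)             # AUGCCUUUUUAA
print(translate(mrna))  # ['Met', 'Pro', 'Phe']
```

A codon missing from this partial table would raise a `KeyError`. A complete table would be needed for real sequences.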

Virus | An infectious microbe that requires a host 
cell (plant, animal, human, or bacterial) in which 
to reproduce. It is composed of proteins and 
genetic material (either DNA or RNA). 

Virus particle | A single member of a viral strain, 
including all requisite proteins and genetic material. 

X-ray crystallography | A technique used to 
determine the detailed, three-dimensional structure 
of molecules. It is based on the scattering of X-rays 
through a crystal of the molecule under study. 